A Fast and Symmetric DUST Implementation to Mask Low-Complexity DNA Sequences

Authors

  • Aleksandr Morgulis
  • E. Michael Gertz
  • Alejandro A. Schäffer
  • Richa Agarwala
Abstract

The DUST module has been used within BLAST for many years to mask low-complexity sequences. In this paper, we present a new implementation of the DUST module that uses the same function to assign a complexity score to a sequence, but uses a different rule by which high-scoring sequences are masked. The new rule masks every nucleotide masked by the old rule and occasionally masks more. The new masking rule corrects two related deficiencies with the old rule. First, the new rule is symmetric with respect to reversing the sequence. Second, the new rule is not context sensitive; the decision to mask a subsequence does not depend on what sequences flank it. The new implementation is at least four times faster than the old on the human genome. We show that both the percentage of additional bases masked and the effect on MegaBLAST outputs are very small.
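
The abstract does not define the complexity score itself. As a rough illustration only, the sketch below assumes the commonly described DUST-style score: count each overlapping triplet, sum c(c-1)/2 over the counts, and divide by the number of triplets minus one. The function name `dust_score` is hypothetical, and the paper's masking rule (which decides what to mask from these scores) is not reproduced here. The example only shows that a score of this form is unchanged when the sequence is reversed, which is the kind of symmetry the new rule is said to preserve.

```python
from collections import Counter


def dust_score(seq: str) -> float:
    """Triplet-based complexity score (assumed form; not given in the abstract)."""
    # Collect all overlapping triplets of the sequence.
    triplets = [seq[i:i + 3] for i in range(len(seq) - 2)]
    if len(triplets) < 2:
        # Too short to contain a repeated triplet; treat as zero complexity score.
        return 0.0
    counts = Counter(triplets)
    # Each triplet seen c times contributes c * (c - 1) / 2 repeated pairs.
    repeated_pairs = sum(c * (c - 1) / 2 for c in counts.values())
    return repeated_pairs / (len(triplets) - 1)


if __name__ == "__main__":
    for s in ("AAAAAAAA", "ACGTACGT"):
        # Reversing the sequence reverses every triplet, so the counts
        # (and therefore the score) stay the same.
        print(s, dust_score(s), dust_score(s[::-1]))
```

Under this assumed definition, a homopolymer run scores much higher than a mixed sequence of the same length, so it would be the more likely candidate for masking.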

Similar resources

Efficient implementation of low time complexity and pipelined bit-parallel polynomial basis multiplier over binary finite fields

This paper presents two efficient implementations of fast and pipelined bit-parallel polynomial basis multipliers over GF (2m) by irreducible pentanomials and trinomials. The architecture of the first multiplier is based on a parallel and independent computation of powers of the polynomial variable. In the second structure only even powers of the polynomial variable are used. The par...


gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences

Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...


Origin of dust pollution particulate matter less than 2.5 micron in Mashhad city using HYSPLIT and DREAM8b model

Background and Objective: The aim of this study was to investigate dust origin particulate (PM2.5) in Mashhad city in a long period of time (2014-2019) based on unhealthy days. Furthermore, changes in meteorological parameters and their relationship with dust storms have also been investigated. Materials and Methods: In order to locate dust pollution hotspots in mashhad air, first, information...


FORMULATION AND PREPARATION OF A NOVEL DUST FREE FAST SETTING DENTAL ALGINATE IMPRESSION MATERIAL

Abstract: A novel dust free alginate impression material was formulated and prepared, comprising an alginate polyvinylpyrrolidone and tetraflouroethylene resins, a mixture of liquid paraffin and dimethylpolysiloxane oil as the dust generation controlling agents and processed diatomaceous earth filler which was obtained from Iranian ore. No dusting was detected during the mixing of the powde...


Rapid purification of HU protein from Halobacillus karajensis

The histone-like protein HU is the most-abundant DNA-binding protein in bacteria. The HU protein non-specifically binds and bends DNA as a hetero- or homodimer, and can participate in DNA supercoiling and DNA condensation. It also takes part in DNA functions such as replication, recombination, and repair. HU does not recognize any specific sequences but shows a certain degree of specificity to ...


Journal:
  • Journal of computational biology : a journal of computational molecular cell biology

Volume 13, Issue 5

Pages: -

Publication date: 2006